Introduce a bootstrap command to auto-install packages #650

owais · 2020-05-05T12:51:34Z

This commit introduces a new boostrap command that is shipped as part of
the opentelemetry-auto-instrumentation package. The command detects
installed libraries and installs the relevant auto-instrumentation
packages.

ocelotl

Blocking comment on having hardcoded references to existing instrumentations. Also, please refrain from any single use variable or function, it makes it pretty hard to review code when you have to be jumping around the file.

ocelotl · 2020-05-05T22:04:29Z

opentelemetry-auto-instrumentation/src/opentelemetry/auto_instrumentation/bootstrap.py

+instrumentations = {
+    "dbapi": "opentelemetry-ext-dbapi>=0.6b0",
+    "flask": "opentelemetry-ext-flask>=0.6b0",
+    "grpc": "opentelemetry-ext-grpc>=0.6b0",
+    "requests": "opentelemetry-ext-http-requests>=0.6b0",
+    "mysql": "opentelemetry-ext-mysql>=0.6b0",
+    "psycopg2": "opentelemetry-ext-psycopg2>=0.6b0",
+    "pymongo": "opentelemetry-ext-pymongo>=0.6b0",
+    "pymysql": "opentelemetry-ext-pymysql",
+    "redis": "opentelemetry-ext-redis",
+    "sqlalchemy": "opentelemetry-ext-sqlalchemy",
+    "wsgi": "opentelemetry-ext-wsgi>=0.6b0",
+}


We should not have hardcoded references to existing instrumentations. This list should be compiled from the available packages that implement the opentelemetry-instrumentor entry point.

is there a way to get this information at runtime or do you want to make it a build step and just automate generation of this list?

Maybe this can help. Let me know if it does, please 🙂

Not sure if it does. This can help me get a list of installed instrumentations at runtime right?

This script is supposed to be used to install the required instrumentation packages. If we were to automate this, I think the way to do it would be to create a dev script that walks over the ext packages, figures out which ones are auto-instrumentation packages, then figures out which libraries they instrument and then generate a .py file for bootstrap.py to import and use. Even if we do that, in my experience working with a similar command for SignalFx's Python instrumentation, we might need to pin specific versions of different instrumentation in different versions of the bootstrap script so it might not always be feasible to auto-generate such a list without adding some "hard-coded" annotations to instrumentations or bootstrap script.

This is a bit of a tricky problem. @owais can you file an issue to see if we can track a more automated way to do this, or just maybe include this in documentation for auto-instrumentation?

There's merit in hardcoding as well: we don't add magic auto-instrumentation until we've tested things sufficiently.

ocelotl · 2020-05-05T22:15:58Z

opentelemetry-auto-instrumentation/src/opentelemetry/auto_instrumentation/bootstrap.py

+    )
+
+
+def _pip_check():


Can we please refrain from single use functions? It forces the reader to look back in the file while reading this code.

I think single use functions are good, when you are reading the code you find the function and you know what it is doing without having to understand how it is done. If you remove them, you'll end up with a very long function having to understand all the implementation details at once.

ocelotl · 2020-05-05T22:20:01Z

opentelemetry-auto-instrumentation/src/opentelemetry/auto_instrumentation/bootstrap.py

+_OUTPUT_INSTALL = "install"
+_OUTPUT_REQUIREMENTS = "requirements"


Please move these module-level constants inside run since they are only used there.

ocelotl · 2020-05-05T22:26:23Z

opentelemetry-auto-instrumentation/src/opentelemetry/auto_instrumentation/bootstrap.py

+    cmd = {
+        _OUTPUT_INSTALL: _run_install,
+        _OUTPUT_REQUIREMENTS: _run_requirments,
+    }[args.output]
+    cmd(_find_installed_libraries())


Suggested change

cmd = {

_OUTPUT_INSTALL: _run_install,

_OUTPUT_REQUIREMENTS: _run_requirments,

}[args.output]

cmd(_find_installed_libraries())

{

_OUTPUT_INSTALL: _run_install,

_OUTPUT_REQUIREMENTS: _run_requirments,

}[args.output](_find_installed_libraries())

I had that but it was just too much going on in a single line :)

ocelotl · 2020-05-05T22:27:50Z

opentelemetry-auto-instrumentation/src/opentelemetry/auto_instrumentation/bootstrap.py

+    return library in sys.modules or pkgutil.find_loader(library) is not None
+
+
+def _find_installed_libraries():


Single use function here

ocelotl · 2020-05-05T22:30:19Z

opentelemetry-auto-instrumentation/src/opentelemetry/auto_instrumentation/bootstrap.py

+    pip_check = check_pipe.communicate()[0].decode()
+    pip_check_lower = pip_check.lower()


Suggested change

pip_check = check_pipe.communicate()[0].decode()

pip_check_lower = pip_check.lower()

pip_check_lower = check_pipe.communicate()[0].decode().lower()

owais · 2020-05-05T22:47:44Z

Single use functions were used because they are all system calls and can be more easily substituted if needed for example, can be mocked in tests. Making system calls inline makes mocking harder and less clear. I think it is worth it wrapping system calls in their own functions but happy to re-consider.

ocelotl · 2020-05-05T23:08:19Z

Single use functions were used because they are all system calls and can be more easily substituted if needed for example, can be mocked in tests. Making system calls inline makes mocking harder and less clear. I think it is worth it wrapping system calls in their own functions but happy to re-consider.

Ok, making things easier to mock in tests is actually a good reason for single use functions, let's just try to use them only when strictly necessary 👍

opentelemetry-auto-instrumentation/src/opentelemetry/auto_instrumentation/bootstrap.py

mauriciovasquezbernal · 2020-05-08T13:46:07Z

opentelemetry-auto-instrumentation/src/opentelemetry/auto_instrumentation/bootstrap.py

+    "mysql": "opentelemetry-ext-mysql>=0.6b0",
+    "psycopg2": "opentelemetry-ext-psycopg2>=0.6b0",
+    "pymongo": "opentelemetry-ext-pymongo>=0.6b0",
+    "pymysql": "opentelemetry-ext-pymysql",


Why some of them are missing the version?

They hadn't been published on pypi yet.

mauriciovasquezbernal · 2020-05-08T14:01:37Z

opentelemetry-auto-instrumentation/src/opentelemetry/auto_instrumentation/bootstrap.py

+    parser.add_argument(
+        "-o",
+        "--output",
+        choices=[_OUTPUT_INSTALL, _OUTPUT_REQUIREMENTS],
+        default=_OUTPUT_INSTALL,
+        help="""
+        install - uses pip to install the new requirements using to the currently active site-package.
+        requirements - prints out the new requirements to stdout. Output can be piped and appended to
+                       a requirements.txt file.
+        """,
+    )


Do we really need to have the intermediate --output command?
opentelemetry-bootstrap -o install looks weird to me.

What about removing it and having something like the following?

opentelemetry-bootstrap --install opentelemetry-bootstrap --requirements

Btw, should a call without arguments (opentelemetry-bootstrap) perform the installation? I got surprised by it, I run opentelemetry-bootstrap to see what was it about and suddenly it was performing a lot of installations.

I assumed we might have a more actions in future like updating requirements.txt file in place or support for other tools like pipenv, poetry, etc. Would --action/-a be a better name for the flag instead of --output/-o ?

Replaced with --action/-a instead and change the default to printing requirements.

mauriciovasquezbernal · 2020-05-08T14:08:00Z

opentelemetry-auto-instrumentation/src/opentelemetry/auto_instrumentation/bootstrap.py

+    OpenTelemetry auto-instrumentation packages often have traced libraries as instrumentation dependency
+    (e.g. flask for opentelemetry-ext-flask), so using -I on library could cause likely undesired Flask upgrade.
+    Using --no-dependencies alone would leave potential for nonfunctional installations.


Would you please mind wrapping these long lines?

mauriciovasquezbernal · 2020-05-08T14:22:32Z

opentelemetry-auto-instrumentation/src/opentelemetry/auto_instrumentation/bootstrap.py

+            logger.info(
+                "Existing %s installation detected.  Uninstalling.", package
+            )
+            _pip_uninstall(package)


I don't know if we should uninstall packages, or at least, if it should be the default behavior without a flag for that. I can imagine an use case where the user has a working setup but he's missing the instrumentation for some library, he tries opentelemetry-bootstrap and boom, all his instrumentation packages are updated, this could break this setup.

I think in this case we should just print a message telling the user that the instrumentation is already installed and not actions will be performed.

I also noticed that this logic is uninstalling the instrumentation even if the version to be installed is the same one.

Reason to uninstall is that pip does not update packages with pip install for some sources github being one. So if a dependency installs directly from github, pip install does not update to newest version. We are not referencing any packages to be installed directly from github right now and not sure if at least the production ever will. We can remove this if everyone thinks we'll always be referencing pypi packages only.

he tries opentelemetry-bootstrap and boom, all his instrumentation packages are updated, this could break this setup.

Well, the intention of the command is to install/update the dependencies. We can make dry-run the default action for the command and print the dependencies, and force users to issue --install flag to actually make changes to site-packages like you suggested in another comment.

I've made printing the requirements the default action. LMK what you think about that.

mauriciovasquezbernal · 2020-05-08T14:40:55Z

opentelemetry-auto-instrumentation/src/opentelemetry/auto_instrumentation/bootstrap.py

+    )
+
+
+def _pip_check():


I think single use functions are good, when you are reading the code you find the function and you know what it is doing without having to understand how it is done. If you remove them, you'll end up with a very long function having to understand all the implementation details at once.

mauriciovasquezbernal · 2020-05-08T14:44:41Z

opentelemetry-auto-instrumentation/src/opentelemetry/auto_instrumentation/bootstrap.py

+    OpenTelemetry auto-instrumentation packages often have traced libraries as instrumentation dependency
+    (e.g. flask for opentelemetry-ext-flask), so using -I on library could cause likely undesired Flask upgrade.


I think we need to clarify it in a broader way. Do we want instrumentations to have install dependencies on the instrumented libraries?
If a user does pip install opentelemetry-ext-flask, should it update/install the Flask version installed?

I don't think they should. It causes more problems than it solved but it is what we do right now. Examples https://github.com/open-telemetry/opentelemetry-python/blob/master/ext/opentelemetry-ext-flask/setup.cfg#L43 https://github.com/open-telemetry/opentelemetry-python/blob/master/ext/opentelemetry-ext-django/setup.cfg#L41

There is a discussion regarding hardcoding, hardcoding may end up being the preferred approach. Dismissing to unblock this while the discussion is ongoing.

lzchen · 2020-05-12T17:53:47Z

opentelemetry-auto-instrumentation/src/opentelemetry/auto_instrumentation/bootstrap.py

+# relevant instrumentors and tracers to uninstall and check for conflicts for target libraries
+libraries = {
+    "dbapi": ("opentelemetry-ext-dbapi",),
+    "flask": ("opentelemetry-ext-flask",),


Now that opentelemetry-ext-django is merged, can you add django as well?

lzchen · 2020-05-12T18:02:51Z

opentelemetry-auto-instrumentation/src/opentelemetry/auto_instrumentation/bootstrap.py

+
+_ACTION_INSTALL = "install"
+_ACTION_REQUIREMENTS = "requirements"
+


It would be preferred to have a description of what this module does here as well as in the README so users will know what is going on and being installed automatically.

owais · 2020-05-18T15:13:51Z

Thanks @ocelotl.

@open-telemetry/python-approvers PTAL. I've fixed issues raised by most of the comments or replied to them. Working on add docs for the new command now.

owais · 2020-05-22T10:57:55Z

Took care of the remaining comments and updated autoinstrumentation package docs. PTAL.

codeboten · 2020-05-26T18:11:29Z

opentelemetry-auto-instrumentation/src/opentelemetry/auto_instrumentation/bootstrap.py

+    "dbapi": "opentelemetry-ext-dbapi>=0.8.8",
+    "django": "opentelemetry-ext-django>=0.8.8",
+    "flask": "opentelemetry-ext-flask>=0.8.8",
+    "grpc": "opentelemetry-ext-grpc>=0.8.8",
+    "requests": "opentelemetry-ext-requests>=0.8.8",
+    "jinja2": "opentelemetry-ext-jinja2>=0.8.8",
+    "mysql": "opentelemetry-ext-mysql>=0.8.8",
+    "psycopg2": "opentelemetry-ext-psycopg2>=0.8.8",
+    "pymongo": "opentelemetry-ext-pymongo>=0.8.8",
+    "pymysql": "opentelemetry-ext-pymysql>=0.8.8",
+    "redis": "opentelemetry-ext-redis>=0.8.8",
+    "sqlalchemy": "opentelemetry-ext-sqlalchemy>=0.8.8",
+    "wsgi": "opentelemetry-ext-wsgi>=0.8.8",


Are the version numbers correct here? I would expect them to be 0.8b0

No idea how they crept in. I think search/replace command messed them up. Fixed now. Thanks.

I'm still seeing 0.8.8 unless im missing something

My bad. Should be fixed now. The push didn't go through last time.

codeboten

Just blocking on the question of versions, otherwise this looks great!

codeboten

LGTM 👍

codeboten · 2020-05-26T22:19:49Z

@owais please update the changelog, then i'll merge it

This commit introduces a new boostrap command that is shipped as part of the opentelemetry-auto-instrumentation package. The command detects installed libraries and installs the relevant auto-instrumentation packages.

owais · 2020-05-27T09:18:53Z

Updated. @codeboten

owais force-pushed the bootstrap-cmd branch 6 times, most recently from a85749a to 4d20855 Compare May 5, 2020 17:24

owais marked this pull request as ready for review May 5, 2020 17:38

owais requested a review from a team May 5, 2020 17:39

owais force-pushed the bootstrap-cmd branch 3 times, most recently from 57b9a53 to 4ecd50e Compare May 5, 2020 18:24

owais mentioned this pull request May 5, 2020

Provide easy way to install relevant instrumentation #606

Closed

owais force-pushed the bootstrap-cmd branch 5 times, most recently from 0bd266d to 25bbf76 Compare May 5, 2020 20:31

ocelotl previously requested changes May 5, 2020

View reviewed changes

mauriciovasquezbernal reviewed May 7, 2020

View reviewed changes

opentelemetry-auto-instrumentation/src/opentelemetry/auto_instrumentation/bootstrap.py Outdated Show resolved Hide resolved

mauriciovasquezbernal reviewed May 7, 2020

View reviewed changes

opentelemetry-auto-instrumentation/src/opentelemetry/auto_instrumentation/bootstrap.py Outdated Show resolved Hide resolved

mauriciovasquezbernal reviewed May 8, 2020

View reviewed changes

owais force-pushed the bootstrap-cmd branch from ef1eb61 to e2c237d Compare May 11, 2020 11:26

owais mentioned this pull request May 11, 2020

How/where should bootstrap command get the list of auto-instrumentation packages from? #668

Closed

owais force-pushed the bootstrap-cmd branch from e2c237d to 069aff5 Compare May 11, 2020 12:56

ocelotl self-requested a review May 11, 2020 19:11

lzchen reviewed May 12, 2020

View reviewed changes

owais force-pushed the bootstrap-cmd branch 2 times, most recently from 8c07b89 to d016505 Compare May 18, 2020 14:52

codeboten requested review from mauriciovasquezbernal, lzchen and toumorokoshi May 20, 2020 22:29

owais force-pushed the bootstrap-cmd branch 6 times, most recently from dd8784f to 78784bb Compare May 22, 2020 10:56

owais force-pushed the bootstrap-cmd branch 4 times, most recently from d181b78 to ab53d36 Compare May 26, 2020 09:38

lzchen approved these changes May 26, 2020

View reviewed changes

owais force-pushed the bootstrap-cmd branch from ab53d36 to e2af30b Compare May 26, 2020 17:06

codeboten reviewed May 26, 2020

View reviewed changes

codeboten suggested changes May 26, 2020

View reviewed changes

owais force-pushed the bootstrap-cmd branch 2 times, most recently from 0397b54 to 1823ff2 Compare May 26, 2020 21:48

codeboten approved these changes May 26, 2020

View reviewed changes

Introduce a bootstrap command to auto-install packages

4d82d94

This commit introduces a new boostrap command that is shipped as part of the opentelemetry-auto-instrumentation package. The command detects installed libraries and installs the relevant auto-instrumentation packages.

owais force-pushed the bootstrap-cmd branch from cf5f696 to 4d82d94 Compare May 27, 2020 09:18

codeboten merged commit 334a534 into open-telemetry:master May 27, 2020

owais deleted the bootstrap-cmd branch May 27, 2020 16:18

		_OUTPUT_INSTALL = "install"
		_OUTPUT_REQUIREMENTS = "requirements"

		return library in sys.modules or pkgutil.find_loader(library) is not None


		def _find_installed_libraries():

		pip_check = check_pipe.communicate()[0].decode()
		pip_check_lower = pip_check.lower()

	pip_check = check_pipe.communicate()[0].decode()
	pip_check_lower = pip_check.lower()
	pip_check_lower = check_pipe.communicate()[0].decode().lower()

		OpenTelemetry auto-instrumentation packages often have traced libraries as instrumentation dependency
		(e.g. flask for opentelemetry-ext-flask), so using -I on library could cause likely undesired Flask upgrade.


		_ACTION_INSTALL = "install"
		_ACTION_REQUIREMENTS = "requirements"

Introduce a bootstrap command to auto-install packages #650

Introduce a bootstrap command to auto-install packages #650

Conversation

owais commented May 5, 2020

ocelotl left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

owais commented May 5, 2020

ocelotl commented May 5, 2020

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

owais commented May 18, 2020

owais commented May 22, 2020

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

codeboten left a comment

Choose a reason for hiding this comment

codeboten left a comment

Choose a reason for hiding this comment

codeboten commented May 26, 2020

owais commented May 27, 2020